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Recovering a Clipped Signal in Sparseland 

Alejandro J. Weinstein and Michael B. Wakin 
Abstract 

In many data acquisition systems it is common to observe signals whose amplitudes have been 
clipped. We present two new algorithms for recovering a clipped signal by leveraging the model as- 
sumption that the underlying signal is sparse in the frequency domain. Both algorithms employ ideas 
commonly used in the field of Compressive Sensing; the first is a modified version of Reweighted l\ 
minimization, and the second is a modification of a simple greedy algorithm known as Trivial Pursuit. An 
empirical investigation shows that both approaches can recover signals with significant levels of clipping. 

Index Terms 

Restoration, signal clipping, sparsity, compressive sensing. 

I. Introduction 

In many practical situations, either because a sensor has the wrong dynamic range or because signals 
arrive that are larger than anticipated, it is common to record signals whose amplitudes have been clipped. 
Any method for restoring the values of the clipped samples must — implicitly or explicitly — assume some 
model for the structure of the underlying signal. For example, one of the first attempts to "de-clip" a 
signal was the work of Abel and Smith (T), who assumed that the underlying signal had limited bandwidth 
relative to the sampling rate (i.e., that it was oversampled) and recovered the original signal by solving 
a convex feasibility problem. Godsill et al. later tackled the de-clipping problem using a parametric 
model and a Bayesian inference approach. Along the same lines, Olofsson [3] proposed a maximum 
a posteriori estimation technique for restoring clipped ultrasonic signals based on a signal generation 
model and a bandlimited assumption. 

Meanwhile, recent research in fields such as Compressive Sensing (CS) flU has shown the incredible 
power of sparse models for recovering certain signal information. Many signals can be naturally assumed 
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to be sparse in that they have few non-zero coefficients when expanded in a suitable basis; the name 
"Sparseland" has been informally used to describe the broad universe of such signals (3). Although a 
typical CS problem involves an incomplete set of random measurements (as opposed to a complete — 
but clipped — set of deterministic samples), sparse models have made a limited appearance in the de- 
clipping literature. In particular, Gemmeke [6] et al. imputed noisy speech features by considering the 
spectrogram of the signal as an image with missing samples, represented the spectrogram in terms of 
an overcomplete dictionary, and used sparse recovery techniques to recover the missing samples. Using 
the model assumption that the underlying signal is sparse in an overcomplete harmonic dictionary, Adler 
et al. d later adapted the Orthogonal Matching Pursuit (OMP) [5] recovery algorithm from CS into a 
de-clipping algorithm that they call constraint-OMP. 

In this paper, we present two methods for de-clipping a signal under the assumption that the original 
signal is sparse in the frequency domain, i.e., that it can be represented as a concise sum of harmonic 
sinusoids. This model is general enough to embrace a wide set of signals that could be recorded from 
certain communication systems, resonant physical systems, etc. This model is also commonplace in the 
CS literature, particularly in settings where random measurements are collected in the time domain. 

Although the measurements we consider are not randomly collected^ we do find that certain ideas 
from the field of CS can be leveraged. In particular, we have modified several CS algorithms in an attempt 
to account for the clipping constraints. Among the methods that we have tried, the two with the best 
performance are a modified version of Reweighted l\ minimization [8] and a modified version of the 
Thresholding algorithm J5j, also known as Trivial Pursuit (TP) [9]. This is surprising since TP, a very 
simple greedy algorithm, is one of the poorest performing algorithms in conventional CS problems [5]. We 
also show that, when tested on frequency sparse signals, these two methods outperform constraint-OMP. 

II. Preliminaries 

A. Problem definition 

Let x G R N be a K -sparse signal in the Fourier domain, i.e., x = and ||a||o = K, where ^ is 
the N x N inverse Discrete Fourier Transform (DFT) matrix and || • ||o denotes the number of non-zero 
entries of a vector. Because of the Hermitian symmetry property of real signals, the sparsity level K is 

'in fact they are "adversarial", in that clipping eliminates the samples with the highest energy content. 
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in general twice the number of harmonics in xj^] Let the clipped version of x be x c , where 

C u if x(n) > C u , 
Vein) = {Ci if x(n) < C h 
x{n) otherwise, 

and C u and C\ are the known upper and lower clipping values, respectively. Our goal is to recover the 
original signal x from the observed clipped signal x c . 

Denote by Q u and Qi the index sets of the upper and lower clipped samples, respectively, and by 
O nc the index set of the non-clipped samples: Q u = {n\x c (n) = C u }, f2; = {n\x c (n) = C{\, and 
O nc = {£l u U f^) c . Similarly, denote by ^ u and ^ the matrices formed with the rows i G fl u and 
j G Qi of ^, respectively. We can write the non-clipped values of x c as y = <&x, where $ is a restriction 
operator formed with the rows j G f2 nc of the N x N identity matrix. 

B. Basis Pursuit, Basis Pursuit with Clipping Constraints, and Reweighted l\ with Clipping Constraints 

The canonical CS method for recovering a sparse signal is known as Basis Pursuit |5]. Given a set of 
non-clipped linear measurements y = $x = $*a, Basis Pursuit involves solving the following convex 
optimization problem: 

q = argmin ||Wa||i s.t. = y, (BP) 

where W is a diagonal weighting matrix with the norm of the columns of in its main diagonal and 
zeros elsewhere. In the de-clipping problem, we also know that samples clipped by the upper limit must 
have values greater or equal than C u , and samples clipped by the lower limit must have values smaller 
or equal than C\. We can then propose a version of Basis Pursuit with clipping constraints: 

a = argmin || Wa||i 

aeC « (BPCC) 

s.t. = y, ^ u a > C u , < Q. 

Another technique commonly used in CS is "Reweighted l\ minimization" [8]. In its original formu- 
lation, this method iterates over a weighted version of ( |BP| ), adjusting the weights based on the solution 
obtained in the previous iteration. This method typically has better signal recovery performance than 
Basis Pursuit but at the expense of a higher computational load. We adapt this method to the de-clipping 
problem by replacing ( |BP| > at each iteration with ( |BPCC| ). We dub this method Reweighted l\ with 
Clipping Constraints (R^iCC). Algorithm [T] shows the complete method. 

2 The exceptions are harmonics of frequency or 7r, which contribute only one DFT coefficient each. 
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Algorithm 1 Reweighted l\ minimization with clipping constraints (RiiCC) 
input: $, *, * u , y, d, C v , £ 

1=1, W t w = l,i = l,...,N 
repeat 

a™> = argmin \\W^ 'ot\\i 

s.t. = y, *„a > C u , t> t a < Ci 

1=1+1 

until I > W + 1 or \\a {l) - a^" 1 )]^ < 5 
output: c/ _1 



We test these three approaches with the signal x(n) = sin {2im/N + vr/4) for iV = 128. Figure 1 'a) 
shows the result for a clipping level of ±0.75, at which there are 70 non-clipped samples, and Fig. 1 b) 
shows the result for a clipping level of ±0.72, at which there are 66 non-clipped samples. These numbers 
of non-clipped sample^] correspond to the transition between the recovery and non-recovery zones of 
operation of ((BP]) and flBPCQ . 



In this experiment and in others (see Sec. IV I, we observe that adding clipping constraints to Basis 



Pursuit does not help to perfectly recover signals with lower clipping thresholds. RiiCC, on the other 



hand, can recover signals with more significant levels of clipping. This improvement of RiiCC over ( |BP| > 
and (BPCCl is actually substantially better than is typically observed in CS (H. 



Thinking in terms of CS principles, the Restricted Isometry Property (RIP) [4J is commonly used for 
theoretical analysis of compressive measurement operators. The RIP can be shown to hold with high 
probability for a randomly generated matrix with a small number of rows, and when it holds, such a 
matrix can be used to exactly recover any sparse signal (up to a certain sparsity level). This perspective 
is not the right one to analyze the de-clipping problem, however, and it cannot be used to explain why 



( |BP[ ) and ( |BPCC[ ) fail in the previous example. First, since the matrix is not random, we cannot use 
any of the standard probabilistic tools to predict whether it will satisfy the RIP. Second, while the RIP 
guarantees that a fixed measurement matrix can be used to recover any sparse signal, in the de-clipping 
problem the matrix <&\I' is relevant only for the small set of signals that, when clipped, actually produce 
the samples given by this matrix. In other words, <I> itself is dependent on the unknown signal x. This 
dependency is not only unusual in CS, it is also contrary to what makes a measurement matrix favorable 
in CS: while random matrices tend to capture a representative sample of signal entries, both large and 

3 Due to the nature of this signal x(n), it is not possible to set the clipping level so that the number of non-clipped samples 
is between 66 and 70. 
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Fig. 1: Reconstruction ofx{n) = sin (2im/N + tt/4) by ((BP), ( jBPCq , and Algorithm 1 (RhCC). (a) Clipping level 
±0.75. All three approaches recover the signal, (b) Clipping level ±0.72. Only R£\ CC recovers the signal exactly. 



small, the clipping process deliberately excludes all of the large signal entries and keeps only the small 
ones. 

III. Trivial Pursuit with Clipping Constraints 

Let us note that the DFT of the clipped signal x c contains, in addition to the harmonics introduced by 
the clipping process, all of the harmonics present in the original signal x. Interestingly, the harmonics 
with the biggest magnitude typically coincide with the ones from the original signal. Figure [2] shows an 
example for a signal with sparsity level K = 10 clipped at an amplitude corresponding to 20% of its 
peak value. We see that the 5 biggest harmonics of x c are at the same locations as the 5 harmonics of 
x. Our second proposed de-clipping algorithm exploits this observation. 

The method is very simple and consists of two stages. First, we identify the support (the location of 
the non-zero Fourier coefficients) of the signal. Second, we estimate the value of the coefficients on this 
support using a least-squares approach, similar to that used in other greedy methods such as Matching 
Pursuit or OMP [5]. If we know the sparsity level K a priori, we can estimate the support simply by 
finding the K biggest harmonics of x c . In the more general case where we do not know K, we select the 
elements of the support one at a time in a greedy manner, until the reconstruction error on the non-clipped 
samples is small enough. 

Algorithm [2] shows a detailed description of the method. In the match step we compute the DFT of 
the clipped signal — this happens only once. Then we repeat the following steps until the residual r is 
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Fig. 2: Support estimation using the DFT of the clipped signal, (a) A signal x with sparsity level K = 10 and its 
clipped version x c , with C u /\\x\\oo = 0.2, corresponding to M = 40 non-clipped samples, (b) DFT of x and x c for 
< k < y ■ Note that the 5 biggest harmonics of x c are at the same locations as the harmonics of x. 

Algorithm 2 Trivial Pursuit with Clipping Constraints 
input: <t>, x c . y, e 
initialize: r = y, A (1) = 0, £ = 1 
match: h = DFT{x c } (indexed from to N - 1) 
while 1 1 r* ] 1 2 > e do 

identify: k = argmax 0< ^ < jv \h(j)\ 

A (e+D = A W u r kj (TV _ fc ) mod jYj 

h(k) = 

update: a = argmin z: supp(z) ca(«+D Hi/ ~ ^z\\ 2 
r = y — <E>vT/q: 

1 = 1+1 

end while 

output: x = *a = *argmin z: supp(z) caW Hi/ - 



arbitrary small (we use e = 10 in our experiments). In the identify step we add the indices associated 
with the current largest harmonic to the support index set A, and we then set those coefficients to zero to 
avoid selecting them again in the next iteration. In the update step we compute the DFT coefficients of a 
signal — restricted to the support A — that best approximates the non-clipped samples y in a least-squares 
sense. Note that the coefficients «a on this support are easily computed as oca = (&^) A y, where ($'1')^ 
is the pseudoinverse of the columns of indexed by A. 

Although perhaps not evident at first sight, Algorithm ^corresponds to a modified version of the method 
known as Trivial Pursuit (TP) 0, 0. Given a set of non-clipped linear measurements y = <3?x = ^^a, 
TP would estimate the support of a simply by computing the score Iitp = (^^) T y and selecting 
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Fig. 3: Recovering a two-tone signal using TPCC. The 
clipping level C u = 0.7 is just below the high-frequency 
"bumps". It is possible to recover the signal with a clipping 
level down to C u = 0.2. We set the signal length to 
N = 128. 



Fig. 4: Recovering a clipped signal using BP, BPCC, 
constraint-OMP, Rt x CC, and TPCC. We plot the average 
minimum number of non-clipped samples M m i n required 
to recover signals of different sparsity levels K. 



the indices of the largest entries of hrp- We can write hxp = {&^) T y = ^ T & T y and note that the 
vector <& T y € corresponds to a zero-padded version of y with the non-clipped samples at the proper 
locations. Since multiplying by ^ T is equivalent to computing the DFT of a vector, hxp is in fact the 
DFT of the zero-padded version of y. The vector h computed in the match of Algorithm [2] is actually 
very similar to hxp, except that instead of computing the DFT of the zero-padded version of y, we 
compute the DFT of x c , which is equal to y padded with the clipped values instead of zeros. In other 
words, Algorithm [2] exploits the knowledge of the clipped values. For this reason we dub our method 
TPCC. 

To illustrate the effectiveness of TPCC we experiment with the signal x(n) = sin(27rn//V)-r-0.25 sin(27r3n/iV) 
of length N = 128. We clip this signal, shown in Fig. [3} at a level just below the "bumps". It might seem 
impossible to recover the signal once the oscillations due to the third harmonic are missing. Remarkably, 
however, TPCC not only recovers this signal at the clipping level of C u = 0.7, but it can even recover 
this signal down to the clipping level of C u = 0.2, at which point there are only 10 non-clipped samples. 

Although TP is arguably the simplest reconstruction method for sparse signals in CS, it also is one of the 
poorest performing in terms of the number of measurements required for successful signal recovery |5|. 



For this reason it is quite surprising that in this experiment and in others (see Sec. IV I TPCC can be so 
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Fig. 5: Recovering a clipped signal using RiiCC and 
TPCC. We plot the probability of perfect recovery as a 
function of the sparsity level K for M = 70 non-clipped 
samples. 




5 10 15 20 25 30 35 40 
Sparsity level K 

Fig. 6: Recovering a clipped signal using TPCC. We plot 
the probability of perfect recovery as a function of the spar- 
sity level K for different numbers of non-clipped samples 
M. 



effective for de-clipping sparse signals. 

IV. Experimental Results 

In this section we empirically evaluate the methods described previously. We also compare with 
constraint-OMP^ For all experiments that follow we generate, for each value of the sparsity level K, 
signals of length N = 128 having K non-zero coefficients with frequencies selected randomly, amplitudes 
chosen randomly from a uniform distribution between 0.5 and 1.5, and phases selected randomly]^] 

In the first experiment, we find the average minimum number M m i n of non-clipped samples required 
to recover a signal as a function of K. We compute the average over 100 simulation runs. Figure [4] 
shows the results. BP and BPCC perform very poorly, being unable to recover the original signal except 
when the clipping is very mild. Constraint-OMP performs better, while TPCC and RiiCC perform much 
better still. In fact, both TPCC and RiiCC can reliably recover the signal using a number of non-clipped 
samples that is not much larger than K, while BP and BPCC require a number of non-clipped samples 
much closer to N. 

4 We have found that constraint-QMP exhibits better performance with signals sparse in the Direct Cosine Transform (DCT) 
domain than with signals sparse in the DFT domain. We thus use the DCT as the sparsity basis for testing this method. 
5 MATLAB code is available at https://github.com/aweinstein/declipping. 
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In the second experiment, we compare R£\CC and TPCC in a different way. We fix the number of non- 
clipped samples to M = 70 and plot the probability of perfect recovery (declared when \\x — x\\ < 10 -3 ) 
as a function of the sparsity level K. Figure [5] shows the results using 100 trials for each combination 
of M and K. Although RiiCC performs somewhat better than TPCC, it is important to underscore that 
TPCC requires significantly fewer computations. 

In the final experiment, we examine the performance of TPCC more closely. We plot the probability 
of perfect recovery as a function of K for different values of M. Figure [6] shows the results using 500 
simulation runs for each combination of M and K. As expected, we the probability of recovery increases 
as the sparsity level K decreases, and as the number of non-clipped samples M increases. Again, in 
general, we can expect a high probability of recovery from TPCC (over our random signal model) when 
M is a small multiple of K. 

V. Conclusions 

We have presented two methods for restoring a clipped signal using the model assumption of sparsity in 
the frequency domain. Both techniques can be extended to account for noise, and preliminary experiments 
on noisy signals are promising although space limitations prevent us from discussing this more deeply. 
One of our methods, TPCC, is particularly simple to implement; its running time is dominated by the 
computation of the DFT and the solution of the least-squares problem, and it is significantly faster than 
R^iCC. 

Our algorithms are inspired by existing techniques from the field of CS, and the performance we 
achieve (where the requisite number of non-clipped samples M scales with K) is fully in line with 
the state-of-the-art performance in CS. This is in spite of the fact that standard RIP analysis does not 
apply to the de-clipping problem and that the measurement operator in our problem is non-random and 
signal-dependent. Insight from CS would suggest that this signal dependence could be catastrophic for 
standard sparse approximation algorithms. Thus, we believe that further work is warranted to understand 
why even a simple algorithm such as TPCC can succeed in the de-clipping problem when much more 
complicated algorithms are required in CS. 
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